Time Series Modeling, Pt. 3: Seasonal-Trend Decomposition

2017-{07..08}, Josh Montague

This is Part 3 of the "Time Series Modeling in Python" series.

If you'd like to hop back to earlier sections, recall:

  • In Part 1, we looked at data structures within the pandas library that make working with time series particularly convenient.
  • In Part 2, we looked at some of the simplest methods for modeling a time series, making forecasts, and evaluating the accuracy of our models.

In this section, we'll look at a common pattern for modeling time series data that takes greater advantage of prior observations in the series. Specifically, we're going to look at breaking down an observed time series into components that represent underlying trends and seasonality. Additionally, we'll write and use an "STL decomposition" because it's a nice interpretable model that doesn't exist in statsmodels and I think it should.

The general outline we'll follow is:

  • start with a motivating example ("start with the answer")
  • build our own version to see how it works
  • look at an alternative implementation

Consistent with earlier parts of the series, the OTexts content on forecasting is a great reference, and we'll follow some of the conventions there, particularly Chapter 6.5.


In [ ]:
import pandas as pd
import numpy as np
import scipy as sp
import statsmodels.api as sm
import matplotlib
import matplotlib.pyplot as plt

%matplotlib inline
matplotlib.rc('figure', figsize=(10,8))

Decomposing a time series

In the previous parts of this series, our modeling (and forecasting) efforts often ignored most of the data in our time series. Our predictions were based either on just the last-observed point, or on a single point a relatively small distance in the past.

Now, we'll consider larger-scale patterns in the time series data. Often these patterns are described in terms of a decomposition into "components." The set of components we'll refer to in this session is:

  • trend: long-term (low-frequency, large-period), low-variance changes in the data
  • seasonal: fixed-period (or frequency) fluctuations in the data (day, week, month, etc)
  • residual (or error): any difference remaining after the previous components are removed from observed data

We can write out the combined model as:

y = S + T + E, if the model is additive

or

y = S * T * E, if the model is multiplicative.

The additive model makes sense when the magnitude of the seasonal component doesn't vary with the absolute level of the series, and the multiplicative model makes sense when that variation is present. This is often an empirical choice, and the additive model is often the default setting.
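
One handy consequence: if the seasonal swings do grow with the level of the series, taking a log turns the multiplicative model into an additive one, since log(y) = log(S) + log(T) + log(E). Here's a minimal sketch on synthetic data (the variable names are just for illustration):

In [ ]:
# sketch: a log transform turns a multiplicative series into an additive one
# (synthetic data; purely illustrative)
t = np.arange(365 * 4)
level = 100 + 0.05 * t                              # slowly rising trend
cycle = 1 + 0.2 * np.sin(2 * np.pi * t / 365)       # seasonal factor that scales with the level
noise = np.exp(np.random.normal(scale=0.01, size=len(t)))
y_mult = level * cycle * noise

# log(y) = log(T) + log(S) + log(E) -> additive, so the additive machinery applies to log(y)
y_log = np.log(y_mult)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(y_mult)
ax1.set_title('multiplicative series')
ax2.plot(y_log)
ax2.set_title('log-transformed (additive) series')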

The specific implementation of both the "trend" and "seasonal" components are variable, and we'll look at a couple of examples below.

Get some data

Like many libraries, statsmodels has some datasets built-in. This dataset contains weekly observations of atmospheric CO$_2$ at Mauna Loa, Hawaii, spanning roughly forty years.


In [ ]:
# http://www.statsmodels.org/stable/datasets/index.html
dataset = sm.datasets.co2.load_pandas()

# special sm object ("bunch"-like cf sklearn datasets)
dataset

In [ ]:
# has a pd.dataframe in the .data attr
co2_ts = dataset.data

co2_ts.head()

In [ ]:
co2_ts.plot(title='Mauna Loa Weekly Atmospheric CO2 Data')

plt.xlabel('date')
plt.ylabel('CO2 concentration (ppmv)')

Right off the bat, we can identify some qualities of this data

  • ✅ general "trend" (up and right)
  • ✅ seasonality (wiggles that look yearly)
  • ✅ missing data (some small gaps)

Since many timeseries models don't play nicely with missing data or NaNs, let's fill in the missing data with linear interpolations.


In [ ]:
# how much missing data?
co2_ts.isnull().sum()

In [ ]:
# often, ts methods/libs don't play nicely with missing data/NaNs, so
#  fill missing values (recall pd timeseries methods from Part 1!)
co2_ts = (co2_ts
          .resample('D')
          .mean()
          .interpolate('linear')
         )

co2_ts.head(10)

In [ ]:
co2_ts.plot()

# if you want a close-up
#co2_ts.head(1000).plot()

Seasonal decomposition with statsmodels

To motivate the work, I'll start with the answer that statsmodels provides out-of-the-box. Then we'll look under the hood to build some of it ourselves.


In [ ]:
from statsmodels.tsa.seasonal import seasonal_decompose

In [ ]:
# sm has recently been updated to accept pd.dataframes in more places (previously all numpy arrays)
seas_decomp = seasonal_decompose(co2_ts, model='additive', freq=365)

seas_decomp

Use seas_decomp.<TAB> above to see the attributes of this object. It's simple: a few arrays and a .plot() method.


In [ ]:
# the DecomposeResult doesn't expose very much for you, just arrays and a mpl.figure convenience method
fig = seas_decomp.plot()

This figure highlights the goal: take the observed data (top), identify the overall trend (a low-variance version of the observations, second chart), and identify the periodic component (called "seasonal," regardless of its frequency). Then, whatever is left after subtracting these from the observed data is the residual (bottom chart). Ideally, we're left with small-amplitude, Gaussian-ish residuals. In this example, it looks like relatively random, zero-centered jitters, at a level that is a fraction of a percent of the observations. That's good!

Note: inspecting the data for its most significant periodicity is important! By default, statsmodels looks inside the pandas frame and maps the frequency of the index to a filter frequency. That sounds reasonable, but here the daily index leads to a filter width of 7, i.e. a weekly smoothing. Our data doesn't have a weekly periodicity, so the trend ends up overfit (see the charts just below). We have to look at our data and observe that the major periodicity is annual, so (above) we set the frequency to 365, the period measured in index steps given daily observations.


In [ ]:
# default settings -> read the pd.datetimeindex freq (1 day), assumes weekly cycles
seas_decomp = seasonal_decompose(co2_ts)

# overfit!
seas_decomp.plot();

Build it ourselves

Recall, we have basically two goals:

  • calculate and subtract trend (smooth)
  • calculate and subtract seasonality

Trend component

The seasonal_decompose method isn't technically an "STL" (Seasonal-Trend decomposition using Loess) because it doesn't use the Loess method (more on this soon), but it does decompose the signal into seasonal and trend components. The statsmodels function uses a convolution between the observed data and a filter (roughly, a square wave) defined by the specified frequency of the seasonal component.

The method will introspect a dataframe for the frequency of its DatetimeIndex, but you may have to manually override it. If you have a month of daily sales data, the largest seasonal component is probably weekly. In our example, we have years of daily CO$_2$ data and the biggest seasonal component is the annual one (365). You could imagine using a Fourier Transform to identify the largest seasonal component automatically, but this is left as an exercise for the reader.
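
If you're curious what that exercise might look like, here's a rough sketch (my addition, not part of the statsmodels workflow) using scipy.signal.periodogram; for this series the peak should land near 365 days:

In [ ]:
# sketch: estimate the dominant period with a periodogram
from scipy import signal

y_obs = np.asanyarray(co2_ts).squeeze()
# crudely remove the long-run trend so the low-frequency ramp doesn't swamp the spectrum
y_centered = y_obs - pd.Series(y_obs).rolling(365, center=True, min_periods=1).mean().values

freqs, power = signal.periodogram(y_centered)
# skip the zero-frequency bin, then convert the peak frequency to a period (in days)
peak = np.argmax(power[1:]) + 1
print('dominant period ~ {:.0f} days'.format(1.0 / freqs[peak]))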

So what is a convolution? Generally, it's an operation that takes two functions and produces a third. Roughly, it's the integral of the pointwise multiplication of the two functions, as a function of the amount that one of them is translated. I always remember it best as "smearing one curve across another while dragging it from left to right." I find the animated figure on the Wikipedia convolution page helpful.
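
To make that concrete, here's a tiny sketch (purely illustrative) of convolving a flat window with a noisy signal, which is nothing more than a moving average:

In [ ]:
# sketch: convolving a flat (boxcar) window with a noisy signal is just a moving average
xs_demo = np.arange(200)
noisy = np.sin(2 * np.pi * xs_demo / 50) + np.random.normal(scale=0.3, size=len(xs_demo))

window = np.repeat(1.0 / 25, 25)                  # flat window, length 25, area 1
smoothed = np.convolve(noisy, window, mode='same')

plt.plot(noisy, '.', alpha=0.5, label='noisy signal')
plt.plot(smoothed, '-', label='boxcar convolution (moving average)')
plt.legend();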

We don't have to go so far as writing our own convolution. This one from statsmodels is built around the relevant tools from scipy. Below, I'll follow, briefly, what the seasonal_decompose() method is doing under the hood. Don't get too hung up on the concept of the convolution; it's just one way to smooth the observed data into a trend that happens to account for variation over a particular window size. We'll look at another example shortly.

We'll calculate a few things, then plot them all together for comparison.


In [ ]:
from statsmodels.tsa.filters.filtertools import convolution_filter

In [ ]:
# the filter size (units = # of data points)
freq = 365

# a flat (boxcar) filter: `freq` points, each of value 1/freq (i.e. a moving average)
filt = np.repeat(1./freq, freq)
#filt

In [ ]:
# number of data points to include in inspection charts
nobs = 2000

In [ ]:
# convert our dataframe of observations to 1-d np.array
observed = np.asanyarray(co2_ts).squeeze()

observed[:nobs]

In [ ]:
# calculate the "trend" array by dragging filter across data
trend = convolution_filter(observed, filt)

#trend[:nobs]

In [ ]:
# subtract the trend array ("standardize" values)
detrended = observed - trend

# these values should be closer to 0
detrended[:nobs]

In [ ]:
fig, axes = plt.subplots(3, 1, sharex=True)

# filter
plt.subplot(311)
# note: i'm artificially adding a bunch of 0s to the filter here so we can see it on the same x scale
plt.plot(np.append(filt, np.array([0]*(nobs - len(filt)))), '.-', label='filter')
plt.title("first {} observed data points".format(nobs))
plt.legend()

# trend
plt.subplot(312)
plt.plot(observed[:nobs], '.-', label='observed')
plt.plot(trend[:nobs], '.-', label='trend')
plt.legend()

# standardized / de-trended
plt.subplot(313)
plt.plot(detrended[:nobs], '.-', label='detrended')
plt.legend()

Seasonal component

Now we use the detrended series to find the seasonal component. In statsmodels, they do this with an interesting empirical method. They calculate the mean value for each of the freq positions in the cycle. That is, the mean across the values at index 0, 365, 730, etc., then the mean across the values at index 1, 366, 731, etc., and so on, until we've covered each of the freq (here, 365) possible means.
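
If the strided indexing feels abstract, here's a toy sketch (not from the original notebook) of the same trick on a short array with a cycle length of 3:

In [ ]:
# toy sketch: detrended[i::freq] picks out the i-th point of every cycle
toy = np.array([1.0, 2.0, 3.0, 1.1, 2.1, 3.1, 0.9, 1.9, 2.9])
toy_freq = 3

for i in range(toy_freq):
    print('position {} in cycle: {} -> mean {:.2f}'.format(i, toy[i::toy_freq], toy[i::toy_freq].mean()))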


In [ ]:
# special handling of mean wrt NaN values
from pandas.core.nanops import nanmean as pd_nanmean

In [ ]:
# calculate mean value in each point of the seasonal cycle (defined by freq)
#  e.g. each day in the annual cycle
period_averages = np.array([pd_nanmean(detrended[i::freq]) for i in range(freq)])

# center average cycle 
period_averages -= np.mean(period_averages)

len(period_averages)

In [ ]:
plt.plot(period_averages, '.-', label='seasonal cycle')

plt.xlabel('point in cycle')
plt.ylabel('mean value')
plt.legend();

In [ ]:
# tile out the seasonal curve - this becomes the seasonal part of the model 
seasonal = np.tile(period_averages, len(observed) // freq +1)[:len(observed)]

#len(seasonal)

In [ ]:
plt.plot(seasonal[:nobs], '.-')

In [ ]:
# the residual is what's left over after the seasonal component 
#  is subtracted from the detrended data
resid = detrended - seasonal

Now we have all the pieces we need to construct the seasonal_decompose output: trend and seasonal components, and the residual measurements.


In [ ]:
# decomposition details (first `nobs` points)
fig, axes = plt.subplots(4, 1, sharex=True)

# trend
plt.subplot(411)
plt.plot(observed[:nobs], '.-', label='observed')
plt.plot(trend[:nobs], '-', label='trend')
plt.title('decomposition on first {} points'.format(nobs))
plt.legend()

# detrended
plt.subplot(412)
plt.plot(detrended[:nobs], '-', label='detrended')
plt.legend()

# seasonal
plt.subplot(413)
plt.plot(seasonal[:nobs], '-', label='seasonal')
plt.legend()

# residual
plt.subplot(414)
plt.plot(resid[:nobs], '-', label='residual')
plt.legend()

In [ ]:
# all data points 
fig, axes = plt.subplots(4, 1, sharex=True)

# trend
plt.subplot(411)
plt.plot(observed, '-', label='observations')
plt.plot(trend, '-', label='trend')
plt.title('decomposition on all points')
plt.legend()

# detrended
plt.subplot(412)
plt.plot(detrended, '-', label='detrended')
plt.legend()

# seasonal
plt.subplot(413)
plt.plot(seasonal, '-', label='seasonal')
plt.legend()

# residuals
plt.subplot(414)
plt.plot(resid, '-', label='residual')
plt.legend()

In [ ]:
# look at the distribution of residuals ( hist() doesn't like NaNs )
plt.hist([ x for x in resid if not np.isnan(x) ], bins=100);

plt.xlabel('residual')
plt.ylabel('count')
plt.title('distribution of residuals')

If the residual distribution were really skewed, we might consider going back to our assumption about the seasonal frequency, and possibly modifying our filter (in the convolution example). At the very least, it would set our expectations for how any predictions would over- or under-estimate real values.
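
If you'd rather have a number than an eyeball test, here's a quick sketch (my addition) using scipy.stats to quantify the skew and run a normality test:

In [ ]:
# sketch: quantify how skewed / non-normal the residuals are
from scipy import stats

resid_clean = resid[~np.isnan(resid)]

print('skewness: {:.3f}'.format(stats.skew(resid_clean)))
print('normality test p-value: {:.3g}'.format(stats.normaltest(resid_clean).pvalue))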

The seasonal_decompose method in statsmodels is perfectly fine, but I've long wanted statsmodels to have a proper STL decomposition. There are many libraries full of advanced methods for time series forecasting in Python (and we'll likely see some later in this series!), but not as many that are simple. I value simplicity and interpretability, both of which I think STL embodies. Looking at the functionality of R's forecast library can be a little depressing for Pythoneers in this regard (Hyndman is prolific).

So, it doesn't exist? Let's just make it!

Lowess & STL decomposition

STL (Seasonal-Trend decomposition using Loess) was published in 1990 by Cleveland, as an application of the Lowess (locally weighted scatterplot smoothing) method published by the same author in 1979. statsmodels doesn't have an out-of-the-box STL decomp method, but it does have an implementation of the Lowess smoother.

Lowess is generally a super handy way of highlighting trends in noisy data. It's an underutilized tool (I think) that also shows up in e.g. ggplot, and now seaborn. Before we use it in a decomposition, let's make sure we understand what it's doing by itself...


In [ ]:
lowess = sm.nonparametric.lowess

In [ ]:
# make some noisy data
nobs = 200

x = np.linspace(0,nobs-1,nobs)

# sine-ish
#y = 0.02*x + np.sin(0.25*x) + np.random.normal(scale=0.25, size=len(x))

# linear-ish
y = 0.02*x + np.random.normal(scale=0.75, size=len(x))

In [ ]:
plt.plot(y, 'o')

Unfortunately, statsmodels has a different API philosophy than, say, sklearn. The (endog, exog) function signature, i.e. (y, x), can take some getting used to if you're used to seeing the (X, y) pattern. But, ultimately, it just takes some time to translate in your mind.

There are a couple of parameters in a Lowess model, but the main one that affects the fit is frac, which controls how much adjacent data is used in calculating any one point of the smoothed curve. frac is the fraction (of the entire array of data) nearest to the relevant point that is included in each local regression.

The regression is locally weighted - meaning the importance of some y values is modified - by a function that has a form roughly between a short-tailed Gaussian and a rounded square wave.
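
Concretely, the usual choice is the tricube kernel; here's a quick sketch of its shape (my illustration, assuming the standard tricube weights):

In [ ]:
# sketch: the tricube weight, w(u) = (1 - |u|^3)^3 for |u| < 1, 0 outside
u = np.linspace(-1, 1, 201)
w = (1 - np.abs(u) ** 3) ** 3

plt.plot(u, w)
plt.xlabel('scaled distance from the point being fit')
plt.ylabel('weight')
plt.title('tricube weighting (the usual Lowess kernel)');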

Let's compare output, varying the settings.


In [ ]:
# default frac=2/3
z = lowess(y, x, return_sorted=False)

# compare to a couple extreme values of frac
frac1 = 0.1
frac2 = 0.9

w = lowess(y, x, frac=frac1, return_sorted=False)
v = lowess(y, x, frac=frac2, return_sorted=False)

In [ ]:
plt.plot(y, '.', label='obs')

plt.plot(z, '--', label='frac=2/3 (default)')
plt.plot(w, '--', label='frac={}'.format(frac1))
plt.plot(v, '--', label='frac={}'.format(frac2))

plt.legend()

Note: the importance of the frac kwarg changes with the number of data points. Small N (total observations) means a much higher variance model than large N for the same frac value.
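
To see that in action, here's a small sketch (my addition) fitting the same frac=0.1 to a noisy sine with N=50 and N=2000 points:

In [ ]:
# sketch: the same `frac` covers far fewer points at small N, so the fit is noisier
for n in (50, 2000):
    xs = np.linspace(0, 4 * np.pi, n)
    ys = np.sin(xs) + np.random.normal(scale=0.5, size=n)
    smooth = lowess(ys, xs, frac=0.1, return_sorted=False)
    plt.figure()
    plt.plot(xs, ys, '.', alpha=0.5, label='obs (N={})'.format(n))
    plt.plot(xs, smooth, '-', label='lowess, frac=0.1')
    plt.legend()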

Drop in for convolution

This time, I'll skip the drawn-out version we used previously and drop the Lowess smoothing in as the trend calculation in a decomposition. I'm also going to import some utilities directly from statsmodels. We'll talk through them, and if you're curious for more, you can poke around in the statsmodels source code.


In [ ]:
def stl_decompose(df, freq=365, lo_frac=0.6, lo_delta=0.0):
    """
    An STL decomposition of time series data.
    
    This implementation is modeled after the statsmodels seasonal_decompose method
    but substitutes Lowess for convolution in its trend estimation.
    
    Parameters
    ----------
    df : pd.DataFrame
        Time series of counts with a DatetimeIndex.
    freq: int, optional
        Most significant periodicity in the time series.
    lo_frac: float, optional
        Fraction of data to use in Lowess smoothing
    lo_delta: float, optional
        Fractional distance within which to use linear-interpolation 
        instead of weighted regression. Significantly affects 
        computation time!

    Returns
    -------
    results : obj
        An object with seasonal, trend, and resid attributes.

    Notes
    -----
    This is an additive model, Y[t] = T[t] + S[t] + e[t]        
    """
    # use some existing pieces of statsmodels
    from statsmodels.tsa.seasonal import DecomposeResult
    from statsmodels.tsa.filters._utils import _maybe_get_pandas_wrapper_freq
    import statsmodels.api as sm
    from pandas.core.nanops import nanmean as pd_nanmean
    
    lowess = sm.nonparametric.lowess
    _pandas_wrapper, _ = _maybe_get_pandas_wrapper_freq(df)
    
    # get plain np array
    observed = np.asanyarray(df).squeeze()
    
    # calc trend, remove from observation
    trend = lowess(observed, [x for x in range(len(observed))], 
                   frac=lo_frac, 
                   delta=lo_delta*len(observed),
                   return_sorted=False)
    detrended = observed - trend
    
    # calc one-period seasonality, remove tiled array from detrended
    period_averages = np.array([pd_nanmean(detrended[i::freq]) for i in range(freq)])
    # 0-center the period avgs
    period_averages -= np.mean(period_averages)
    seasonal = np.tile(period_averages, len(observed) // freq + 1)[:len(observed)]    
    resid = detrended - seasonal
    
    # convert the arrays back to appropriate dataframes, stuff them back into 
    #  the statsmodel object
    results = list(map(_pandas_wrapper, [seasonal, trend, resid, observed]))    
    dr = DecomposeResult(seasonal=results[0],
                         trend=results[1],
                         resid=results[2], 
                         observed=results[3],
                         period_averages=period_averages)
        
    return dr

In [ ]:
co2_ts.head()

In [ ]:
# use just like the seasonal_decompose method
co_stl = stl_decompose(co2_ts, freq=365, lo_frac=0.1)

In [ ]:
# we inherit a .plot() method from DecomposeResult
co_stl.plot();

In [ ]:
# or make your own custom figure with the internal df attributes
fig, axes = plt.subplots(41)

# obs + trend
plt.subplot(411)
plt.plot(co_stl.observed, '-', label='x')
plt.plot(co_stl.trend, '-', label='Lowess trend')
plt.legend()

# seasonal
plt.subplot(412)
plt.plot(co_stl.seasonal, '-', label='seasonal')
plt.legend()

# residuals
plt.subplot(413)
plt.plot(co_stl.resid, '-', label='residual')
plt.legend()

# resid hist
plt.subplot(414)
plt.hist(co_stl.resid.values, bins=100)
plt.xlabel('residual')
plt.ylabel('count')

We can also pass an extra kwarg that the Lowess method supports, which allows us not to fit a regression for every point, but to linearly interpolate any point within some distance of the last one calculated. This lets us trade some accuracy for less runtime.


In [ ]:
# this will take ~1-2 minutes
%timeit -n3 -r3 stl_decompose(co2_ts, lo_frac=0.25)

In [ ]:
# don't recalc regression w/in 1% windows 
%timeit -n3 -r3 stl_decompose(co2_ts, lo_frac=0.25, lo_delta=0.01)

Forecasting with seasonal decomposition

Now that we have a decomposition, it'd be great if we could project it forward. Continuing to follow the methods as before, we can consider the components of the decomposition:

y = S + T + E,

forecast them separately and then combine them appropriately. For now, we're only going to consider additive models, and we're not going to incorporate E (maybe later!).

We'll start by forecasting the trend component. In the previous session, we defined a set of simple one-point-ahead forecasting methods that we can reuse here. We can import them from the module included in this repo.
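
For reference, here's roughly what those one-point-ahead helpers do (a sketch based on their descriptions; the actual implementations in rst_timeseries may differ in detail):

In [ ]:
# rough sketches of the imported helpers; the real versions live in rst_timeseries
# (treat these as illustrations, not the canonical implementations)
def _fc_naive_sketch(data, **kwargs):
    """Forecast the next point as the last observed value."""
    return data[-1]

def _fc_drift_sketch(data, n=3, **kwargs):
    """Forecast the next point by extending the line between the last
    observation and the observation n steps earlier."""
    slope = (data[-1] - data[-1 - n]) / float(n)
    return data[-1] + slope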

The workflow we aim for is:

  • decompose an observed time series using the STL decomposition above
  • create a forecast of the observed series (of user-specified length) using the decomposition components, and a chosen forecasting function

In [ ]:
# some methods from earlier RST
from rst_timeseries import fc_naive, fc_drift

In [ ]:
def forecast_stl(stl, steps=10, seasonal=False, fc_func=None, **fc_func_kwargs):
    """
    Create an (additive) forecast, using the given forecasting function, 
    the specified number of steps forward, including optional seasonality, 
    given an STL decomposition of an observed time series.
    
    
    Parameters
    ----------
    stl : an extended statsmodels.tsa.seasonal.DecomposeResult
        STL decomposition of observed time series
    steps: int, optional
        Number of forward steps to include in the forecast
    seasonal: boolean, optional
        Include seasonal component in forecast
    fc_func: function
        Forecasting function which takes an array of observations and returns a single
        valued forecast for the next point
    fc_func_kwargs: keyword arguments
        All remaining arguments are passed to the forecasting function fc_func

    Returns
    -------
    forecast_frame : pd.DataFrame
        A pandas dataframe with timeseries index corresponding to the specified 
        forecast duration

    Notes
    -----
    This is an additive model, Y[t] = T[t] + S[t] + e[t]    
    
    """
    ## container for forecast values
    forecast_array = np.array([])
    
    ## forecast trend
    # unpack precalculated trend array stl frame
    #trend_array = np.asanyarray(stl.trend).squeeze()
    trend_array = stl.trend
    
    # iteratively forecast trend ("seasonally adjusted") component
    for step in range(steps):
        # make this prediction on all available data
        pred = fc_func(np.append(trend_array, forecast_array), **fc_func_kwargs)
        # add this prediction to current array
        forecast_array = np.append(forecast_array, pred)
   
    # forecast index starts one unit beyond observed series
    ix_start = stl.observed.index[-1] + pd.Timedelta(1, stl.observed.index.freqstr)     
    forecast_idx = pd.DatetimeIndex(freq=stl.observed.index.freqstr,
                                    start=ix_start, 
                                    periods=steps)

    ## (optionally) forecast seasonal & combine 
    if seasonal:
        # track index and value of max correlation
        s_ix = 0
        max_correlation = -np.inf
        
        # loop over indexes=length of period avgs
        detrended_array = np.asanyarray(stl.observed - stl.trend).squeeze()
        for i,x in enumerate(stl.period_averages):
            # work slices backward from end of detrended observations
            if i == 0:
                # slicing w/ [x:-0] doesn't work
                detrended_slice = detrended_array[-len(stl.period_averages):]        
            else:
                detrended_slice = detrended_array[-(len(stl.period_averages) + i) : -i]
            # calculate corr b/w period_avgs and detrend_slice
            this_correlation = np.correlate(detrended_slice, stl.period_averages)[0]
            if this_correlation > max_correlation:
                # update ix and max correlation
                max_correlation = this_correlation
                s_ix = i
        # roll seasonal signal to matching phase
        rolled_period_averages = np.roll(stl.period_averages, -s_ix)
        # tile as many time as needed to reach "steps", then truncate
        tiled_averages = np.tile(rolled_period_averages, 
                                 (steps // len(stl.period_averages) + 1))[:steps]
        # add seasonal values to previous forecast
        forecast_array += tiled_averages                
    
    # combine data array with index into named dataframe
    forecast_frame = pd.DataFrame(data=forecast_array, index=forecast_idx)
    forecast_frame.columns = [fc_func.__name__]            
    return forecast_frame

To demonstrate how this works, let's start by using the same dataset, but assume that we have only observed a fraction of it when we want to make our forecast.


In [ ]:
# take a fraction to forecast
co2_ts_short = co2_ts.head(10000)

plt.plot(co2_ts, '--', label='full')
plt.plot(co2_ts_short, '-', label='short')
plt.legend();

First, we'll fit the STL model to our observed data set, using the same annual periodicity (freq=365) as before, plus some of the Lowess settings we looked at for efficiency.


In [ ]:
# model the abbreviated series with our STL decomposition
stl = stl_decompose(co2_ts_short, freq=365, lo_frac=0.25, lo_delta=0.01)

# note the last date of the observed frame 
# and compare to first date in forecast frame
co2_ts_short.tail()

Note the final data point in our observed (and modeled) time series. Our forecast should pick up right after this point.

We'll forecast our model out 1000 steps, which will cover a few periods of our data. We'll also use the naive "drift" forecast. Recall from the last session that this makes its next-point prediction based on a linear fit through the last observed point and the point n steps back (default n=3).


In [ ]:
# now forecast forward `steps` values, with optional seasonality, and 
#  using one of our imported forecasting functions
fc_df = forecast_stl(stl, steps=1000, seasonal=True, fc_func=fc_drift)
fc_df.head()

In [ ]:
# compare the forecast to our observations, ground truth, and the modeled trend
plt.plot(co2_ts, ':', label='ground truth')
plt.plot(co2_ts_short.tail(12000), label='obs')
plt.plot(stl.trend, '--', label='stl.trend')
plt.plot(fc_df, '.-', label='forecast')

# zoom way in on the relevant region
#plt.xlim('1980','1993'); plt.ylim(330,370);

plt.legend();

🙌🏼 🙌🏼 🙌🏼

Just to be sure it works, let's try it on another time series...

The small file boulder.csv included in misc/ is an approximate daily count of the appearances of "boulder" on Twitter between April and mid-August.


In [ ]:
bdr_counts = (pd.read_csv('misc/boulder.csv', parse_dates=[0])
              .set_index('dt')
             )
bdr_counts.head()

As before, we should fill NaNs, but in this demo there aren't actually any. However, this step is still important because the resample method also ensures that our DatetimeIndex has the freq attribute we need in our STL method.


In [ ]:
# fill NaNs (none in this case), and ensure the df has freq='D' set
bdr_counts = (bdr_counts
              .resample('D')
              .mean()
              .interpolate('linear')
             )     

#bdr_counts.index

In [ ]:
bdr_counts.plot()

plt.ylabel('daily "boulder" counts')

In [ ]:
len(bdr_counts)

In [ ]:
bdr_short = bdr_counts.head(100)

In [ ]:
# it seems the weekly cycle is biggest and data is daily, so freq=7
stl = stl_decompose(bdr_short, freq=7, lo_frac=0.4, lo_delta=0.01)

In [ ]:
stl.plot();

In [ ]:
stl.resid.hist(bins=50);

These residuals are certainly larger than the ones we had with our first data set.


In [ ]:
bdr_fc = forecast_stl(stl, steps=50, seasonal=True, fc_func=fc_drift)

# plot 
plt.plot(bdr_counts, '--o', label='truth')
plt.plot(bdr_short, '--o', label='observed')
plt.plot(bdr_fc, '-o', label='forecast')
plt.plot(stl.trend, ':', label='trend')

plt.ylabel('daily "boulder" counts')
plt.xlabel('date')
plt.legend();

# zoom into forecast region
#plt.xlim('2017-05-15','2017-09'); plt.ylim(1500,3500)

Not bad!

That definitely isn't as beautiful and obvious as the first data set. But it's also a good example of a case where the seasonal component is not the overwhelmingly dominant feature. Here, the daily (and weekly) fluctuations are the same size as (or bigger than) the modeled seasonal component.

Another thing to observe with this model is the effect of the lo_frac term, that is, how narrow a bandwidth is used in the Lowess trend. If you turn it down to ~0.3, you'll see the last few points trending up and to the right, leading the model (which, recall, is a linear extrapolation of the fitted Lowess trend) to diverge quite a lot from the observed data.
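
To see that effect for yourself, here's a quick experiment re-using the functions above (settings chosen just for illustration):

In [ ]:
# re-fit with a narrower Lowess bandwidth and compare forecasts
stl_narrow = stl_decompose(bdr_short, freq=7, lo_frac=0.3, lo_delta=0.01)
bdr_fc_narrow = forecast_stl(stl_narrow, steps=50, seasonal=True, fc_func=fc_drift)

plt.plot(bdr_counts, '--o', alpha=0.5, label='truth')
plt.plot(bdr_fc, '-o', label='forecast (lo_frac=0.4)')
plt.plot(bdr_fc_narrow, '-o', label='forecast (lo_frac=0.3)')
plt.legend();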

So, sadly (but hopefully not surprisingly) there is no silver bullet! But, with some tuning and observation, this is one relatively straightforward method of forecasting a time series.

Wrap-up

In summary, STL decompositions are pretty handy, and relatively efficient.

Pro

  • interpretable
  • flexible and robust
  • arbitrary timescales and seasonality
  • takes more advantage of the observations

Con

  • doesn't account for transient events
  • can be inefficient
  • still requires an expert to identify the most important frequency and the Lowess bandwidth
